Datasets
This chapter briefly reviews datasets for music demixing by summarizing the previous tutorial, which offers a more detailed introduction and explanation.
Data for Music Demixing
At a high level, the inputs and outputs of a source separation model look like this:
Fig. 1 Inputs and outputs of a source separation model.
MUSDB18: tutorial
The MUSDB18 dataset [RLStoter+17] is one of the most widely used datasets for music demixing. For example, its uncompressed version (also known as MUSDB18-HQ [RLS+19]) was the official training dataset for Leaderboard A of the MDX challenge.
This section shows how to play with the musdb package.
First, install the musdb package.
pip install musdb
After installation, load musdb with download=True. This downloads 7-second sample excerpts of the MUSDB18 tracks.
import musdb
mus = musdb.DB(download=True)
We can use mus like a list: it has a length and can be indexed or iterated over.
print(len(mus))
144
Let us load the first track of the MUSDB18 dataset:
track = mus[0]
print(track)
A Classic Education - NightOwl
Let us listen to the mixture (i.e., the input audio in Fig. 1):
from IPython.display import Audio, display
display(Audio(track.audio.T, rate=track.rate))
Let us listen to the output audio tracks (i.e., the targets):
for source in track.sources.keys():
    print('source name: {}'.format(source))
    display(Audio(track.sources[source].audio.T, rate=track.rate))
source name: vocals
source name: drums
source name: bass
source name: other
Thus, the input and output of the MUSDB18 music demixing task are:
input:
track.audio
output:
{source: track.sources[source].audio for source in ['vocals', 'drums', 'bass', 'other']}
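As a minimal sketch, the pairing above can be wrapped in a small helper. The function name `to_example` and the stand-in track built with `SimpleNamespace` are illustrative assumptions, not part of musdb; a real track from `mus` should behave the same way as long as it exposes `audio` and `sources` as used in this section.

```python
from types import SimpleNamespace

# The four MUSDB18 targets listed above.
SOURCE_NAMES = ("vocals", "drums", "bass", "other")


def to_example(track, source_names=SOURCE_NAMES):
    """Return (mixture, targets) for one track-like object.

    Hypothetical helper: it relies only on `track.audio` and
    `track.sources[name].audio`, the attributes used in this section.
    """
    mixture = track.audio
    targets = {name: track.sources[name].audio for name in source_names}
    return mixture, targets


# Stand-in for a musdb track, so the sketch runs without downloading data.
# Each "audio" is a tiny list of [left, right] samples, not a real array.
fake_track = SimpleNamespace(
    audio=[[0.4, 0.4], [0.0, 0.2]],
    sources={
        name: SimpleNamespace(audio=[[0.1, 0.1], [0.0, 0.05]])
        for name in SOURCE_NAMES
    },
)

mixture, targets = to_example(fake_track)
print(sorted(targets))
```

With real data, `to_example(mus[0])` would return the mixture and the four target signals for the first track.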
Quick overview of existing datasets
In the MDX challenge, participants had to train their systems on the training set of MUSDB18-HQ (or MUSDB18) for Leaderboard A. For Leaderboard B, there were no constraints on the choice of training data (i.e., participants could use any available dataset).
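To make the Leaderboard A constraint concrete, here is a small sketch that keeps only training tracks. It assumes each track exposes a `subset` attribute equal to `"train"` or `"test"`; the helper name `training_tracks` and the stand-in tracks are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def training_tracks(tracks):
    """Keep only the tracks whose `subset` attribute is "train".

    Hypothetical helper; assumes track-like objects with a `subset` field.
    """
    return [track for track in tracks if track.subset == "train"]


# Stand-in tracks so the sketch runs without any dataset on disk.
fake_db = [
    SimpleNamespace(name="Track A", subset="train"),
    SimpleNamespace(name="Track B", subset="test"),
    SimpleNamespace(name="Track C", subset="train"),
]

train = training_tracks(fake_db)
print([track.name for track in train])
```

musdb can also make this selection at load time through the `subsets` argument of `musdb.DB`, which avoids filtering by hand.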
Here’s a quick overview of existing datasets for music demixing released since 2015:
| Dataset | Year | Instrument categories | Tracks | Average duration (s) | Full songs | Stereo |
|---|---|---|---|---|---|---|
| DSD100 | 2015 | 4 | 100 | 251 \(\pm\) 60 | ✅ | ✅ |
| MUSDB18 | 2017 | 4 | 150 | 236 \(\pm\) 95 | ✅ | ✅ |
| MUSDB18-HQ | 2019 | 4 | 150 | 236 \(\pm\) 95 | ✅ | ✅ |
| Slakh2100 | 2019 | 34 | 2100 | 249 | ✅ | ❌ |
You can check the full list of datasets here. This extended table is based on: SigSep/datasets, and reproduced with permission.
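The duration column of the table reports a mean and, where available, a spread. A rough sketch of how such a figure could be computed from a collection of tracks follows. It assumes each track exposes a `duration` attribute in seconds; the helper name `duration_stats` and the sample values are made up for illustration and do not reproduce the table's numbers. The table does not state whether the spread is a sample or population standard deviation, so the sketch simply uses the sample version.

```python
import statistics
from types import SimpleNamespace


def duration_stats(tracks):
    """Return (mean, sample standard deviation) of track durations in seconds.

    Hypothetical helper; assumes every track has a numeric `duration`.
    """
    durations = [track.duration for track in tracks]
    return statistics.mean(durations), statistics.stdev(durations)


# Made-up durations, used only to exercise the helper.
fake_tracks = [SimpleNamespace(duration=d) for d in (200.0, 250.0, 300.0)]

mean_s, std_s = duration_stats(fake_tracks)
print(f"{mean_s:.0f} +/- {std_s:.0f} s")
```

Note that `statistics.stdev` needs at least two tracks; for a single track, only the mean is meaningful.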